Good Neighbors Make Good Senses: Exploiting Distributional Similarity for Unsupervised WSD
Authors
Abstract
We present an automatic method for sense labeling of text in an unsupervised manner. The method uses distributionally similar words to derive an automatically labeled training set, which then trains a standard supervised classifier to distinguish word senses. Experimental results on the Senseval-2 and Senseval-3 datasets show that our approach yields significant improvements over state-of-the-art unsupervised methods and is competitive with supervised ones, while eliminating the annotation cost.
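The pipeline the abstract describes can be illustrated with a short sketch: map each distributional neighbor of the target word to the closest target sense, treat corpus sentences containing that neighbor as pseudo-labeled training examples, and fit an off-the-shelf classifier on them. The neighbor list, toy corpus, and the WordNet path-similarity mapping below are illustrative assumptions, not necessarily the paper's exact components.

```python
# Sketch (assumed components): pseudo-label a training set from
# distributional neighbors, then train a standard supervised classifier.
from nltk.corpus import wordnet as wn
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

def best_sense_for_neighbor(target, neighbor):
    """Map a neighbor to the target sense it is closest to, using
    WordNet path similarity (a stand-in for the paper's mapping)."""
    best, best_score = None, 0.0
    for t_sense in wn.synsets(target, pos=wn.NOUN):
        for n_sense in wn.synsets(neighbor, pos=wn.NOUN):
            score = t_sense.path_similarity(n_sense) or 0.0
            if score > best_score:
                best, best_score = t_sense, score
    return best

def build_training_set(target, neighbors, sentences):
    """Sentences containing a neighbor become training examples
    labeled with the sense that neighbor was mapped to."""
    texts, labels = [], []
    for neighbor in neighbors:
        sense = best_sense_for_neighbor(target, neighbor)
        if sense is None:
            continue
        for sent in sentences:
            if neighbor in sent.split():
                texts.append(sent)
                labels.append(sense.name())
    return texts, labels

# Toy usage; real neighbors would come from a distributional thesaurus.
texts, labels = build_training_set(
    "bank", ["riverbank", "shore"],
    ["the shore of the river was muddy",
     "we walked along the riverbank at dawn"])
if texts:
    vec = CountVectorizer()
    clf = MultinomialNB().fit(vec.fit_transform(texts), labels)
```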
Similar Papers
PUTOP: Turning Predominant Senses into a Topic Model for Word Sense Disambiguation
We extend McCarthy et al.'s predominant sense method to create an unsupervised method of word sense disambiguation that uses topics automatically derived with latent Dirichlet allocation (LDA). Using topic-specific synset similarity measures, we create predictions for each word in each document using only word frequency information. It is hoped that this procedure can improve upon the method for l...
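A rough sketch of the idea, under stated assumptions: fit LDA topics over documents, then score each WordNet sense of a word by gloss overlap with the dominant topic's top terms. The overlap scorer here is a simple stand-in for the paper's topic-specific synset similarity measure, which is not reproduced.

```python
# Hypothetical sketch of the PUTOP idea: infer LDA topics, then pick the
# WordNet sense whose gloss best matches the document's strongest topic.
from gensim import corpora, models
from nltk.corpus import wordnet as wn

docs = [["bank", "river", "water", "fish"],
        ["bank", "money", "loan", "interest"]]
dictionary = corpora.Dictionary(docs)
bows = [dictionary.doc2bow(d) for d in docs]
lda = models.LdaModel(bows, num_topics=2, id2word=dictionary, passes=10)

def predominant_sense(word, doc_bow):
    """Pick the sense whose gloss shares the most words with the
    top terms of the document's dominant topic (illustrative scorer)."""
    topic_id = max(lda.get_document_topics(doc_bow), key=lambda t: t[1])[0]
    topic_words = {w for w, _ in lda.show_topic(topic_id, topn=20)}
    return max(wn.synsets(word),
               key=lambda s: len(topic_words & set(s.definition().split())),
               default=None)

print(predominant_sense("bank", bows[0]))
```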
Neighbors Help: Bilingual Unsupervised WSD Using Context
Word Sense Disambiguation (WSD) is one of the toughest problems in NLP, and within WSD, verb disambiguation has proved extremely difficult because of the high degree of polysemy, overly fine-grained senses, the absence of a deep verb hierarchy, and low inter-annotator agreement in verb sense annotation. Unsupervised WSD has received widespread attention, but has performed poorly, especially on verbs. Recen...
From Predicting Predominant Senses to Local Context for Word Sense Disambiguation
Recent work on automatically predicting the predominant sense of a word has proven promising (McCarthy et al., 2004). It can be applied, as a first sense heuristic, to Word Sense Disambiguation (WSD) tasks without needing expensive hand-annotated data sets. Due to the large skew in the sense distributions of many words (Yarowsky and Florian, 2002), the First Sense heuristic for WSD is often...
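The first sense heuristic itself is simple to state in code. NLTK lists WordNet synsets in order of their frequency in sense-tagged corpora, so taking the first synset implements the heuristic; McCarthy et al.'s contribution is predicting that ranking automatically rather than reading it off annotated counts.

```python
# Minimal first sense heuristic: WordNet's sense ordering reflects
# tagged-corpus frequency, so the first synset is the predominant sense.
from nltk.corpus import wordnet as wn

def first_sense(word, pos=None):
    """Return the most frequent (first-listed) WordNet sense, or None."""
    synsets = wn.synsets(word, pos=pos)
    return synsets[0] if synsets else None

sense = first_sense("bank", pos=wn.NOUN)
print(sense, "-", sense.definition())   # bank.n.01 - sloping land ...
```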
متن کاملBootstrapping Without the Boot
What: We like minimally supervised learning (bootstrapping). Let's convert it to unsupervised learning ("strapping"). How: If the supervision is so minimal, let's just guess it! Lots of guesses yield lots of classifiers. Try to predict which one looks plausible (!?!). We can learn to make such predictions. Results (on WSD): Performance actually goes up! (Unsupervised WSD for translational senses, Eng...
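The "strapping" recipe can be mocked up as follows: guess several seed pairs, bootstrap a classifier from each, and keep the run an unsupervised criterion scores as most plausible. The seed pairs, toy data, and the mean-confidence plausibility proxy are all illustrative assumptions; the paper learns the selector rather than hand-picking a criterion.

```python
# Sketch of "strapping" under toy assumptions: many guessed seed pairs,
# one bootstrapped classifier each, select by a plausibility proxy.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

def bootstrap(seed_a, seed_b, sentences):
    """One self-training round: seed-matched sentences become labels.
    A real bootstrapper would iterate, folding confident predictions
    back in as new labeled data."""
    texts = [s for s in sentences if seed_a in s or seed_b in s]
    labels = [0 if seed_a in s else 1 for s in texts]
    if len(set(labels)) < 2:
        return None, 0.0
    vec = CountVectorizer()
    clf = LogisticRegression().fit(vec.fit_transform(texts), labels)
    probs = clf.predict_proba(vec.transform(sentences))
    return clf, probs.max(axis=1).mean()   # plausibility proxy

sentences = ["river bank fishing", "bank loan interest",
             "money in the bank", "walked along the river bank"]
guessed_seeds = [("river", "money"), ("fishing", "loan")]
runs = [(bootstrap(a, b, sentences), (a, b)) for a, b in guessed_seeds]
_, best_seeds = max(runs, key=lambda r: r[0][1])
print("most plausible seed pair:", best_seeds)
```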
An Enhanced Lesk Word Sense Disambiguation Algorithm through a Distributional Semantic Model
This paper describes a new Word Sense Disambiguation (WSD) algorithm that extends two well-known variants of the Lesk WSD method. Given a word and its context, the Lesk algorithm selects the proper meaning by exploiting the maximum number of shared words (maximum overlap) between the context of the word and each definition of its senses (glosses). The main contribution of our approach...
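The gloss overlap at the heart of Lesk is easy to sketch; the paper's enhancement would replace the exact set intersection below with a distributional similarity between context and gloss representations, which this minimal version does not include.

```python
# Minimal gloss overlap (simplified Lesk) over WordNet definitions.
from nltk.corpus import wordnet as wn

def simplified_lesk(word, context_words):
    """Return the sense whose gloss shares the most words with the context."""
    ctx = set(context_words)
    return max(wn.synsets(word),
               key=lambda s: len(ctx & set(s.definition().split())),
               default=None)

sense = simplified_lesk("bank", "deposit money into my account".split())
print(sense, "-", sense.definition() if sense else "no sense found")
```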